
    Constrained ordination analysis with enrichment of bell-shaped response functions

    Constrained ordination methods aim to find an environmental gradient along which the species abundances are maximally separated. The species response functions, which describe the expected abundance as a function of the environmental score, are, according to the ecological fundamental niche theory, only meaningful if they are bell-shaped. Many classical model-based ordination methods, however, use quadratic regression models without imposing the bell shape, and thus allow for meaningless U-shaped response functions. The analysis output (e.g. a biplot) may therefore be misleading, and the conclusions are prone to errors. In this paper we present a log-likelihood ratio criterion with a penalisation term to enforce more bell-shaped response functions. We report the results of a simulation study and apply our method to metagenomics data from microbial ecology.
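The distinction between bell-shaped and U-shaped quadratic responses can be illustrated with a minimal sketch. It assumes the common log-linear form log(mu) = a + b*x + c*x^2, which the abstract does not spell out; the function name is hypothetical, and this is not the penalised likelihood criterion proposed in the paper. Under that assumption the curve is bell-shaped only when c < 0, with optimum -b/(2c) and tolerance sqrt(-1/(2c)).

```python
import math


def gaussian_response_params(b, c):
    """Return (optimum, tolerance) for log(mu) = a + b*x + c*x**2.

    Returns None when c >= 0, i.e. when the fitted curve is not
    bell-shaped (U-shaped for c > 0, monotone or flat for c == 0).
    """
    if c >= 0:
        return None
    optimum = -b / (2.0 * c)            # location of the peak
    tolerance = math.sqrt(-1.0 / (2.0 * c))  # width of the bell
    return optimum, tolerance


bell = gaussian_response_params(b=2.0, c=-0.5)    # bell-shaped fit
u_shaped = gaussian_response_params(b=2.0, c=0.5)  # flagged as not bell-shaped
```

A fit with a positive quadratic coefficient returns `None` here, which is the case an unconstrained quadratic model cannot rule out.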

    Tests of fit for the logarithmic distribution

    Smooth tests for the logarithmic distribution are compared with three tests: the first is a test due to Epps and is based on a probability generating function, the second is the Anderson-Darling test, and the third is due to Klar and is based on the empirical integrated distribution function. These tests all have substantially better power than the traditional Pearson-Fisher X² test of fit for the logarithmic distribution. These traditional chi-squared tests are the only logarithmic tests of fit commonly applied by ecologists and other scientists.
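For orientation, the traditional chi-squared statistic that serves as the baseline can be sketched as follows. This is a simplified illustration with a known parameter p and a pooled tail cell; the Pearson-Fisher version estimates p from the grouped data, which is not shown here, and the function names are hypothetical. The logarithmic probability mass function is P(X = k) = -p^k / (k ln(1 - p)) for k = 1, 2, ...

```python
import math


def logarithmic_pmf(k, p):
    """P(X = k) for the logarithmic (log-series) distribution, k >= 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return -(p ** k) / (k * math.log(1.0 - p))


def pearson_x2(counts, p):
    """Pearson X^2 statistic for observed counts against a logarithmic model.

    counts[i] is the observed frequency of the value i + 1, except for
    the last cell, which pools all larger values.
    """
    n = sum(counts)
    n_cells = len(counts)
    probs = [logarithmic_pmf(k, p) for k in range(1, n_cells)]
    probs.append(1.0 - sum(probs))  # probability of the pooled tail cell
    return sum((obs - n * q) ** 2 / (n * q) for obs, q in zip(counts, probs))


x2 = pearson_x2([50, 30, 20], p=0.5)
```

The statistic is zero only when the observed counts match the expected counts exactly, and grows as the data depart from the logarithmic model.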

    SPsimSeq : semi-parametric simulation of bulk and single-cell RNA-sequencing data

    SPsimSeq is a semi-parametric simulation method to generate bulk and single-cell RNA-sequencing data. It is designed to simulate gene expression data with maximal retention of the characteristics of real data. It is flexible enough to accommodate a wide range of experimental scenarios, including different sample sizes, biological signals (differential expression) and confounding batch effects.

    On the utility of RNA sample pooling to optimize cost and statistical power in RNA sequencing experiments

    Background: In gene expression studies, RNA sample pooling is sometimes considered because of budget constraints or lack of sufficient input material. Using microarray technology, RNA sample pooling strategies have been reported to optimize both the cost of data generation and the statistical power for differential gene expression (DGE) analysis. For RNA sequencing, with its different quantitative output in terms of counts and tunable dynamic range, the adequacy and empirical validation of RNA sample pooling strategies have not yet been evaluated. In this study, we comprehensively assessed the utility of pooling strategies in RNA-seq experiments using empirical and simulated RNA-seq datasets. Results: The data generating model in pooled experiments is defined mathematically to evaluate the mean and variability of gene expression estimates. The model is further used to examine the trade-off between the statistical power of testing for DGE and the data generating costs. Empirical assessment of pooling strategies is done through analysis of RNA-seq datasets under various pooling and non-pooling experimental settings. A simulation study is also used to rank experimental scenarios with respect to the rate of false and true discoveries in DGE analysis. The results demonstrate that pooling strategies in RNA-seq studies can be both cost-effective and powerful when the number of pools, pool size and sequencing depth are optimally defined. Conclusion: For high within-group gene expression variability, small RNA sample pools are effective to reduce the variability and compensate for the loss of the number of replicates. Unlike the typical cost-saving strategies, such as reducing sequencing depth or number of RNA samples (replicates), an adequate pooling strategy is effective in maintaining the power of testing DGE for genes with low to medium abundance levels, along with a substantial reduction of the total cost of the experiment. In general, pooling RNA samples or pooling RNA samples in conjunction with moderate reduction of the sequencing depth can be good options to optimize the cost and maintain the power.
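The trade-off between pool size and number of replicates can be illustrated with a minimal sketch. It assumes a simple additive model that the abstract does not state: each pool averages `pool_size` independent biological samples and then receives one technical error, so the variance of a pool is bio_var / pool_size + tech_var. The function name, parameter names and the numbers below are hypothetical and are not taken from the paper.

```python
def group_mean_variance(bio_var, tech_var, n_pools, pool_size):
    """Variance of a group mean under an assumed additive pooling model.

    Each of the n_pools libraries averages pool_size biological samples
    (variance bio_var each) and adds one technical error (variance tech_var).
    """
    if n_pools < 1 or pool_size < 1:
        raise ValueError("n_pools and pool_size must be at least 1")
    return (bio_var / pool_size + tech_var) / n_pools


# Six individually sequenced samples versus three pools of four samples each.
unpooled = group_mean_variance(bio_var=4.0, tech_var=0.5, n_pools=6, pool_size=1)
pooled = group_mean_variance(bio_var=4.0, tech_var=0.5, n_pools=3, pool_size=4)
```

With high biological variance relative to technical variance, the three pooled libraries in this toy setting give a smaller variance of the group mean than six individual libraries, at half the number of sequencing runs.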

    A method to search for optimal field allocations of transgenic maize in the context of co-existence

    Spatially isolating genetically modified (GM) maize fields from non-GM maize fields is a robust on-farm measure to keep the adventitious presence of GM material in the harvest of neighboring fields due to cross-fertilization below the European labeling threshold of 0.9%. However, the implementation of mandatory and rigid isolation perimeters can affect the farmers' freedom of choice to grow GM maize on their fields if neighboring farmers do not concur with their respective cropping intentions and crop plans. To minimize the presence of non-GM maize within isolation perimeters implemented around GM maize fields, a method was developed for optimally allocating GM maize to a particular set of fields. Using a Geographic Information System dataset and Monte Carlo analyses, three scenarios were tested in a maize cultivation area with a low maize share in Flanders (Belgium). It was assumed that some farmers would act in collaboration by sharing the allocation of all their arable land for the cultivation of GM maize. From the large number of possible allocations of GM maize to any field of the shared pool of arable land, the best field combinations were selected. Compared to a random allocation of GM maize, the best field combinations made it possible to reduce spatial co-existence problems, since at least two times fewer non-GM maize fields and their corresponding farmers occurred within the implemented isolation perimeters. In the selected field sets, the mean field size was always larger than the mean field size of the common pool of arable land. These preliminary data confirm that the optimal allocation of GM maize over the landscape might theoretically be a valuable option to facilitate the implementation of rigid isolation perimeters imposed by law.

    Goodness-of-fit tests based on sample space partitions : a unifying overview

    Recently the authors have proposed tests for the one-sample and the κ-sample problem, and a test for independence. All three tests are based on sample space partitions, but they were originally developed in different papers. Here we give an overview of the construction of these tests, stressing the common underlying concept of “sample space partitions.”

    SPECS: a non-parametric method to identify tissue-specific molecular features for unbalanced sample groups

    Background: To understand biology and differences among various tissues or cell types, one typically searches for molecular features that display characteristic abundance patterns. Several specificity metrics have been introduced to identify tissue-specific molecular features, but these either require an equal number of replicates per tissue or cannot handle replicates at all. Results: We describe a non-parametric specificity score that is compatible with unequal sample group sizes. To demonstrate its usefulness, the specificity score was calculated on all GTEx samples, detecting known and novel tissue-specific genes. A webtool was developed to browse these results for genes or tissues of interest. An example Python implementation of SPECS is available at https://github.com/celineeveraert/SPECS. The precalculated SPECS results on the GTEx data are available through a user-friendly browser at specs.cmgg.be. Conclusions: SPECS is a non-parametric method that identifies known and novel specifically expressed genes. In addition, SPECS could be adopted for other features and applications.

    Small sample inference for probabilistic index models

    Probabilistic index models may be used to generate classical and new rank tests, with the additional advantage of supplementing them with interpretable effect size measures. The popularity of rank tests for small sample inference makes probabilistic index models also natural candidates for small sample studies. However, at present, inference for such models relies on asymptotic theory that can deliver poor approximations of the sampling distribution if the sample size is rather small. A bias-reduced version of the bootstrap and adjusted jackknife empirical likelihood are explored. It is shown that their application leads to drastic improvements in small sample inference for probabilistic index models, justifying the use of such models for reliable and informative statistical inference in small sample studies.
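The effect size behind these models, the probabilistic index P(A < B) + 0.5 P(A = B) for outcomes A and B from two groups, can be estimated empirically from all pairs. The sketch below shows only this plug-in estimate for a two-group comparison; it does not implement the regression models, the bias-reduced bootstrap or the jackknife empirical likelihood discussed in the abstract, and the function name and data are hypothetical.

```python
from itertools import product


def probabilistic_index(group_a, group_b):
    """Empirical estimate of P(A < B) + 0.5 * P(A = B) over all pairs."""
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")
    score = 0.0
    for a, b in product(group_a, group_b):
        if a < b:
            score += 1.0   # pair where the second group's outcome is larger
        elif a == b:
            score += 0.5   # ties count for one half
    return score / (len(group_a) * len(group_b))


pi_hat = probabilistic_index([1.2, 2.5, 3.1], [2.0, 3.5, 4.0])
```

A value of 0.5 indicates no tendency for either group to produce larger outcomes; values near 0 or 1 indicate strong separation.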